Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation
Existing video frame interpolation (VFI) methods blindly predict where each
object lies at a specific timestep t ("time indexing"), an approach that
struggles to resolve precise object movements. Given two images of a baseball,
there are infinitely many possible trajectories: accelerating or decelerating,
straight or curved. This often results in blurry frames, as the method averages
over these possibilities. Instead of forcing the network to learn this complicated
time-to-location mapping implicitly together with predicting the frames, we
provide the network with an explicit hint on how far the object has traveled
between start and end frames, a novel approach termed "distance indexing". This
method offers a clearer learning goal for models, reducing the uncertainty tied
to object speeds. We further observed that, even with this extra guidance,
objects can still appear blurry, especially when they are equally far from both
input frames (i.e., halfway in between), due to the directional ambiguity in
long-range motion. To solve this, we propose an iterative reference-based
estimation strategy that breaks down a long-range prediction into several
short-range steps. When integrating our plug-and-play strategies into
state-of-the-art learning-based models, they exhibit markedly sharper outputs
and superior perceptual quality in arbitrary time interpolations, using a
uniform distance indexing map in the same format as time indexing.
Additionally, distance indexing can be specified pixel-wise, which enables
temporal manipulation of each object independently, offering a novel tool for
video editing tasks like re-timing.
Comment: Project page: https://zzh-tech.github.io/InterpAny-Clearer/ ; Code:
https://github.com/zzh-tech/InterpAny-Clearer
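The distance-indexing interface and the iterative reference-based strategy can be illustrated with a toy numpy sketch, in which simple per-pixel blending stands in for the learned VFI model (all function names are hypothetical, not from the paper's released code):

```python
import numpy as np

def interpolate(frame0, frame1, dist_map):
    """Toy stand-in for a distance-indexed VFI model.

    dist_map gives, per pixel, the fraction of the total motion distance
    the object has traveled (0 = frame0, 1 = frame1). A real model would
    warp pixels along estimated motion; blending is enough to show the
    interface, which matches time indexing's input format."""
    return (1.0 - dist_map) * frame0 + dist_map * frame1

def iterative_interpolate(frame0, frame1, dist_map, n_steps=4):
    """Break one long-range prediction into several short-range steps:
    each step advances the reference frame a fraction of the remaining
    distance toward frame1, reducing directional ambiguity."""
    ref = frame0
    for k in range(1, n_steps + 1):
        prev = dist_map * (k - 1) / n_steps        # distance already covered
        target = dist_map * k / n_steps            # distance after this step
        # re-express the step relative to the remaining ref -> frame1 span
        local = (target - prev) / np.clip(1.0 - prev, 1e-6, None)
        ref = interpolate(ref, frame1, local)
    return ref

frame0 = np.zeros((4, 4))
frame1 = np.ones((4, 4))
# a uniform distance map of 0.5 requests the halfway frame everywhere;
# a non-uniform map would re-time each pixel (object) independently
mid = interpolate(frame0, frame1, np.full((4, 4), 0.5))
```

With linear blending the iterative and direct results coincide exactly; the point of the decomposition only shows up with a real warping model, where short steps are easier to predict than one long one.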
Quanta Burst Photography
Single-photon avalanche diodes (SPADs) are an emerging sensor technology
capable of detecting individual incident photons, and capturing their
time-of-arrival with high timing precision. While these sensors were limited to
single-pixel or low-resolution devices in the past, large (up to 1 MPixel) SPAD
arrays have recently been developed. These single-photon cameras (SPCs) are
capable of capturing high-speed sequences of binary single-photon images with
no read noise. We present quanta burst photography, a computational photography
technique that leverages SPCs as passive imaging devices for photography in
challenging conditions, including ultra low-light and fast motion. Inspired by
the recent success of conventional burst photography, we design algorithms that
align and merge binary sequences captured by SPCs into intensity images with
minimal motion blur and artifacts, high signal-to-noise ratio (SNR), and high
dynamic range. We theoretically analyze the SNR and dynamic range of quanta
burst photography, and identify the imaging regimes where it provides
significant benefits. We demonstrate, via a recently developed SPAD array, that
the proposed method is able to generate high-quality images for scenes with
challenging lighting, complex geometries, high dynamic range and moving
objects. With the ongoing development of SPAD arrays, we envision quanta burst
photography finding applications in both consumer and scientific photography.
Comment: A version with better-quality images can be found on the project
webpage: http://wisionlab.cs.wisc.edu/project/quanta-burst-photography
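The merging step can be sketched under an idealized Poisson photon-arrival model (a hedged sketch: it ignores the alignment stage and sensor non-idealities, and the function name and simulation values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def merge_binary_burst(binary_frames):
    """Merge an aligned stack of binary single-photon frames into a
    linear intensity estimate.

    Under an ideal Poisson model, a pixel with per-frame photon flux
    lam fires with probability p = 1 - exp(-lam). Inverting the mean
    firing rate therefore linearizes the binary response and extends
    dynamic range; real quanta burst photography also aligns the
    frames to compensate motion before this merge."""
    p_hat = binary_frames.mean(axis=0)
    p_hat = np.clip(p_hat, 0.0, 1.0 - 1e-6)   # avoid log(0) at saturation
    return -np.log1p(-p_hat)                   # MLE of per-frame flux lam

# simulate 2000 binary frames of a pixel with true flux 0.5 photons/frame
lam_true = 0.5
frames = rng.random((2000, 1)) < (1.0 - np.exp(-lam_true))
lam_est = merge_binary_burst(frames.astype(float))
```

Averaging many binary frames drives the variance of p_hat down, which is why long bursts recover high-SNR intensity even though each frame carries at most one photon per pixel.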
DisCO: Portrait Distortion Correction with Perspective-Aware 3D GANs
Close-up facial images captured at short distances often suffer from
perspective distortion, resulting in exaggerated facial features and
unnatural/unattractive appearances. We propose a simple yet effective method
for correcting perspective distortions in a single close-up face. We first
perform GAN inversion using a perspective-distorted input facial image by
jointly optimizing the camera intrinsic/extrinsic parameters and face latent
code. To address the ambiguity of this joint optimization, we develop an
optimization schedule, a focal-length reparametrization, a short-distance
initialization, and a geometric regularization. Re-rendering the portrait at a proper focal length
and camera distance effectively corrects perspective distortions and produces
more natural-looking results. Our experiments show that our method compares
favorably against previous approaches qualitatively and quantitatively. We
showcase numerous examples validating the applicability of our method on
portrait photos in the wild. We will release our system and the evaluation
protocol to facilitate future work.
Comment: Project website: https://portrait-disco.github.io
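A toy 1-D sketch shows why jointly optimizing focal length and camera distance is ambiguous, and how a log-focal reparametrization plus a geometric prior on distance resolves it (all values are hypothetical and this is not the paper's actual GAN-inversion objective):

```python
import math

# Pinhole projection: a point at lateral offset X and depth Z projects to
# u = f * X / Z, so focal length f and distance Z are coupled -- only the
# ratio f / Z is constrained by a single observation. A quadratic prior
# pulling Z toward Z_PRIOR makes the problem well-posed, and optimizing
# s = log f keeps the focal length positive (a common reparametrization).
X, U_OBS, Z_PRIOR, LAM = 1.0, 2.0, 3.0, 1.0

def optimize(steps=20000, lr=0.005):
    s, z = 0.0, 1.0                      # s = log f; start at a short distance
    for _ in range(steps):
        f = math.exp(s)
        r = f * X / z - U_OBS            # reprojection residual
        grad_s = 2.0 * r * (X / z) * f   # chain rule: df/ds = f
        grad_z = -2.0 * r * f * X / z**2 + 2.0 * LAM * (z - Z_PRIOR)
        s -= lr * grad_s
        z -= lr * grad_z
    return math.exp(s), z

f_opt, z_opt = optimize()
```

Without the LAM term, any (f, Z) with f/Z = U_OBS/X minimizes the loss; with it, the optimizer settles near Z = Z_PRIOR with the matching focal length, mirroring the role of geometric regularization in the paper.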
Privacy-Preserving Visual Localization with Event Cameras
We present a robust, privacy-preserving visual localization algorithm using
event cameras. While event cameras can potentially enable robust localization
thanks to their high dynamic range and low motion blur, the sensors exhibit a
large domain gap that makes it difficult to directly apply conventional
image-based localization algorithms. To mitigate the gap, we propose applying
event-to-image conversion prior to localization, which leads to stable
localization. From a privacy perspective, event cameras capture only a fraction
of the visual information captured by normal cameras, and thus can naturally hide
sensitive visual details. To further enhance the privacy protection in our
event-based pipeline, we introduce privacy protection at two levels, namely
sensor and network level. Sensor-level protection aims at hiding facial details
with lightweight filtering, while network-level protection targets hiding the
entire user's view in private scene applications using a novel neural network
inference pipeline. Both levels of protection involve lightweight computation
and incur only a small performance loss. We thus expect our method to serve as
a building block for practical location-based services using event cameras. The
code and dataset will be made public through the following link:
https://github.com/82magnolia/event_localization
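A minimal sketch of the event-to-image conversion and a lightweight sensor-level filter (the helper names are hypothetical; real pipelines typically use learned reconstruction networks rather than a raw count image):

```python
import numpy as np

def events_to_image(events, height, width):
    """Accumulate a stream of (x, y, polarity) events into a 2-D event
    image of signed polarity counts -- the simplest stand-in for the
    event-to-image conversion applied before an image-based
    localization pipeline."""
    img = np.zeros((height, width), dtype=np.float32)
    for x, y, pol in events:
        img[y, x] += 1.0 if pol > 0 else -1.0
    return img

def blur_filter(img, k=3):
    """Sensor-level privacy sketch: a lightweight box blur that
    suppresses fine detail (e.g. facial features) while keeping the
    coarse structure that localization features rely on."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

events = [(0, 0, 1), (0, 0, 1), (2, 1, -1)]
img = events_to_image(events, height=3, width=3)
private_img = blur_filter(img)
```

The count image already discards absolute intensity, which is part of why event data naturally hides visual detail; the blur pushes this further at negligible compute cost.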
Unveiling Charge-Separation Dynamics in CdS/Metal–Organic Framework Composites for Enhanced Photocatalysis
Photocatalytic water splitting for H2 production has become one of the most promising pathways for solar energy utilization, yet the charge-separation dynamics in composite photocatalysts remain largely elusive. In the present work, CdS-decorated metal–organic framework (MOF) composites, namely CdS/UiO-66, have been synthesized and exhibit high H2 production activity from photocatalytic water splitting under visible light irradiation, far surpassing their MOF and CdS counterparts. Transient absorption (TA) spectroscopy is adopted in this report to unveil the charge-separation dynamics in CdS/UiO-66 composites, a key process that dictates their function in photocatalysis. We show that, in addition to the preferable formation of fine CdS particles assisted by the MOF, effective electron transfer from excited CdS to UiO-66 significantly inhibits the recombination of photogenerated charge carriers, ultimately boosting the photocatalytic activity for H2 generation. This study of charge-separation dynamics in CdS–MOF composites affords significant insights for the future fabrication of advanced composite photocatalysts.
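The competition that TA spectroscopy resolves can be summarized with simple first-order kinetics (the rate constants below are illustrative placeholders, not values measured in this work):

```python
import math

# The photogenerated electron in excited CdS either transfers to UiO-66
# (rate constant k_et) or recombines with the hole (k_rec). Treating both
# as competing first-order channels, the charge-separation yield is
# k_et / (k_et + k_rec), which is why a fast electron-transfer channel
# suppresses recombination and boosts H2 evolution.

def separation_yield(k_et, k_rec):
    return k_et / (k_et + k_rec)

def excited_state_lifetime(k_et, k_rec):
    # the observed TA decay rate is the sum of all depopulation channels,
    # so adding an ET channel shortens the measured excited-state lifetime
    return 1.0 / (k_et + k_rec)

# hypothetical numbers: ET an order of magnitude faster than recombination
k_et, k_rec = 1.0e10, 1.0e9   # s^-1
phi = separation_yield(k_et, k_rec)
```

In a TA experiment this competition appears as a faster decay of the CdS excited-state signal in the composite than in bare CdS, consistent with the electron-transfer pathway described above.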